
[After 2.7] Fix SonarCloud issues#381

Open
cshiels-ie wants to merge 12 commits into ansible:devel from cshiels-ie:Sonar-Issues

Conversation

@cshiels-ie
Contributor

@cshiels-ie cshiels-ie commented Apr 27, 2026

Summary

  • Fetched 182 open SonarCloud issues from ansible/metrics-utility project
  • ~65 fixes across 4 commits grouped by severity
  • BUG/BLOCKERs: fixed broken super() comparison (python:S3403), method name clash (python:S1845)
  • CRITICAL CODE_SMELLs: empty methods (S1186), identity checks in assertions (S5727), lambda closure over loop variable (S1515), duplicate string literals (S1192), redundant Exception subclass (S5713), missing validate param in pd.merge (S6735)
  • MAJOR CODE_SMELLs: generic Exception replaced (S112), dead commented-out code removed (S125), unused parameters fixed (S1172), duplicate if branches merged (S1871)
  • MINOR CODE_SMELLs: redundant list() before sorted() (S7508), regex character class simplification (S6353)
  • ~117 issues skipped: cognitive complexity refactors (S3776), type-narrowing issues requiring design decisions, FIXMEs

Test plan

  • SonarCloud scan on this PR branch to verify issue count reduction
  • Run existing test suite to confirm no regressions
  • Review skipped issues for follow-up

🤖 Generated with Claude Code


Note

Medium Risk
Touches core rollup merge logic (pd.merge validation) and packaging/shipping configuration checks, which could surface previously-silent data issues or alter runtime failure modes. Changes are mostly mechanical/defensive with expanded test coverage, but affect critical aggregation and upload paths.

Overview
Primarily refactors and hardens metrics/billing utilities to satisfy SonarCloud findings, including safer pandas merge semantics and schema reuse across dataframe implementations.

Key behavior changes include: enforcing pd.merge(..., validate='one_to_one') for rollup merges, fixing lambda capture-over-loop-variable in merge summarization, and centralizing job-host-summary dataframe schema via new DataframeSchemaMixin/JobHostSummarySchema (plus shared constants in metric_utils).

Also cleans up/standardizes error handling and APIs: replaces generic Exception with TypeError/ValueError/RuntimeError in collectors/storage, fixes PackageCRC.is_shipping_configured() to call super().is_shipping_configured(), renames Package.max_data_size() to Package.get_max_data_size(), simplifies set/list sorting unions, and tweaks report/test helpers and coverage (new targeted unit tests and consolidated report sanity checks).

Reviewed by Cursor Bugbot for commit 18dffa5. Bugbot is set up for automated code reviews on this repo.

@coderabbitai
Contributor

coderabbitai Bot commented Apr 27, 2026

📝 Walkthrough

Walkthrough

Consolidates dataframe schema definitions via a new DataframeSchemaMixin and JobHostSummarySchema, simplifies set-to-list conversions when sorting, tightens exception types across storage/collector modules, renames unused parameters, adjusts merge/aggregation lambdas to bind loop variables, and adds/updates extensive tests for validation and merge behavior.

Changes

Cohort / File(s) Summary
Anonymized Rollups - Sorting & JSON handling
metrics_utility/anonymized_rollups/anonymized_rollups.py, metrics_utility/anonymized_rollups/base_anonymized_rollup.py, metrics_utility/anonymized_rollups/credentials_anonymized_rollup.py, metrics_utility/anonymized_rollups/events_modules_anonymized_rollup.py
Pass sets directly to sorted() (remove intermediate list conversions); consolidate JSON sanitization/serialization in base; JSON parsing helper no longer swallows json.JSONDecodeError.
Dataframe Schema Consolidation
metrics_utility/dataframe_schema.py, metrics_utility/metric_utils.py, metrics_utility/library/dataframes/base_traditional.py, metrics_utility/library/dataframes/job_host_summary.py, metrics_utility/automation_controller_billing/dataframe_engine/base.py, metrics_utility/automation_controller_billing/dataframe_engine/dataframe_jobhost_summary_usage.py
Add DataframeSchemaMixin and JobHostSummarySchema and move schema stubs/constants there; refactor classes to inherit mixin; update merge/summarize lambdas to capture col via default arg; pd.merge(..., validate='one_to_one') added.
Exception Specificity in Storage & Collectors
metrics_utility/library/storage/crc.py, metrics_utility/library/storage/directory.py, metrics_utility/library/storage/s3.py, metrics_utility/library/storage/segment.py, metrics_utility/library/collectors/others/total_workers_vcpu.py, metrics_utility/library/collectors/util.py
Replace generic Exception with specific types (ValueError, TypeError, RuntimeError) for configuration, validation, type, and HTTP/API errors.
Report & Utility Adjustments
metrics_utility/automation_controller_billing/report/base.py, metrics_utility/automation_controller_billing/report/report_ccsp_v2.py, metrics_utility/automation_controller_billing/report_saver/report_saver_s3.py, metrics_utility/automation_controller_billing/dedup/renewal_guidance.py, metrics_utility/automation_controller_billing/package/package_crc.py, metrics_utility/base/package.py
Simplify set sorting, remove commented styling, adjust super() call to specific base, avoid intermediate lists in stringify, rename max_data_size() → get_max_data_size(), and tweak report saver file-listing expression.
Extractors, Utils, Management, Collectors
metrics_utility/library/extractors.py, metrics_utility/library/utils.py, metrics_utility/library/storage/util.py, metrics_utility/management_utility.py, metrics_utility/library/collectors/controller/main_jobevent_service.py
Rename unused params to _-prefixed names, replace some comprehensions with literal/set comprehensions or dict.fromkeys, and remove commented examples.
Tests — tarfile membership & param renames
metrics_utility/test/base/functional/test_gathering.py, metrics_utility/test/base/functional/test_slicing.py, metrics_utility/test/base/functional/helpers.py
Use `filename in files` instead of `filename in files.keys()`; rename unused test params to _-prefixed names.
Tests — Validation, Merge, and Schema
metrics_utility/test/base/test_package.py, metrics_utility/test/library/test_coverage_gaps.py, metrics_utility/test/library/test_dataframe_stubs.py, metrics_utility/test/library/test_storage_crc.py, metrics_utility/test/library/test_storage_directory.py, metrics_utility/test/library/test_storage_s3.py, metrics_utility/test/library/test_storage_segment.py, metrics_utility/test/library/test_collectors_prometheus_client.py, metrics_utility/test/library/test_collectors_util.py, metrics_utility/test/library/test_summarize_merge_ops.py
Add numerous tests: schema mixin behavior, storage constructor validation, RuntimeError expectations for Prometheus failures, collector util type errors, summarize/merge operations including one_to_one merge validation, and package size checks.
Report Tests & Conftest helper
metrics_utility/test/ccspv_reports/..., metrics_utility/test/conftest.py
Introduce run_report_sanity_check helper, simplify report tests to use it, assert workbooks have sheets, change mocked extractor to return concrete batch lists.
Misc Tests & Cleanups
metrics_utility/test/dedup/*, metrics_utility/test/test_anonymized_rollups/*, metrics_utility/test/test_helpers.py
Remove commented assertions, simplify set/sorting expressions in tests, tighten JSON serialization assertion to non-empty string.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 3

❌ Failed checks (2 warnings, 1 inconclusive)

Check name Status Explanation Resolution
Description check ⚠️ Warning The description provides a detailed summary of changes, severity groupings, behavioral impacts, and test plans, but is missing the required AAP issue link, structured testing section (prerequisites/steps), and self-review checklist. Add AAP issue link at the top, include structured testing prerequisites and reproduction steps, and complete the self-review checklist items per the template.
Docstring Coverage ⚠️ Warning Docstring coverage is 20.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
Title check ❓ Inconclusive The title 'Fix SonarCloud issues' is partially related to the changeset—it refers to a real aspect of the change (SonarCloud issue fixes are a major component), but it lacks specificity and does not highlight the most important changes from the developer's perspective. Consider a more specific title that highlights the primary changes, such as 'Fix SonarCloud issues: exception types, lambda closures, and merge validation' or similar.
✅ Passed checks (2 passed)
Check name Status Explanation
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.





@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 2 potential issues.


Reviewed by Cursor Bugbot for commit 320813a.

Comment thread metrics_utility/base/utils.py Outdated
Comment thread metrics_utility/anonymized_rollups/events_modules_anonymized_rollup.py Outdated
cshiels-ie and others added 4 commits April 27, 2026 11:59
…45 method name clash

- Fix python:S3403 (BLOCKER BUG) in package_crc.py: `super()` was compared
  with `is False` which always evaluated to False since `super()` returns a
  proxy object, not a boolean. Changed to call `super().is_shipping_configured()`
  and check `not ret`.
- Fix python:S1845 (BLOCKER CODE_SMELL) in base/package.py: Renamed classmethod
  `max_data_size` to `get_max_data_size` to avoid naming clash/confusion with the
  class attribute `MAX_DATA_SIZE`.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
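The S3403 fix above can be sketched in isolation. This is a minimal toy reproduction of the bug pattern, not the actual metrics-utility classes: `super()` returns a proxy object, so comparing it with `is False` can never be true and the parent check is silently skipped.

```python
class Base:
    def is_shipping_configured(self):
        return False


class PackageCRC(Base):
    def is_shipping_configured_buggy(self):
        # BUG (python:S3403): super() is a proxy object, never the
        # boolean False, so this branch can never execute.
        if super() is False:
            return False
        return True

    def is_shipping_configured(self):
        # FIX: actually call the parent method and test its result.
        ret = super().is_shipping_configured()
        if not ret:
            return False
        return True


p = PackageCRC()
print(p.is_shipping_configured_buggy())  # True  (parent check skipped)
print(p.is_shipping_configured())        # False (parent check honored)
```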
…, S5713, S6735

- Fix python:S1186 (empty methods) in base_traditional.py and dataframe_engine/base.py:
  Added explanatory comments and changed pass to meaningful return values in
  stub static methods (cast_types, data_columns, operations, unique_index_columns,
  build_dataframe).
- Fix python:S5727 (identity checks always True) in test files: Replaced
  assert workbook is not None (openpyxl never returns None) with meaningful
  assertions (assert len(workbook.sheetnames) > 0). Fixed assert json_string is
  not None with assert len(json_string) > 0.
- Fix python:S1192 (duplicate string literals) in job_host_summary.py and
  dataframe_jobhost_summary_usage.py: Extracted 'datetime64[ns]' into a
  DATETIME64_NS constant.
- Fix python:S1515 (lambda closure over loop variable) in base_traditional.py,
  dataframe_engine/base.py, and renewal_guidance.py: Added default parameter
  c=col / s=serial to bind loop variable at lambda definition time.
- Fix python:S5713 (redundant Exception subclass in except tuple) in
  events_modules_anonymized_rollup.py and test_complex_CCSPv2_with_canonical_facts.py:
  Removed json.JSONDecodeError since it is a subclass of ValueError.
- Fix python:S6735 (missing validate parameter on pd.merge) in base_traditional.py
  and dataframe_engine/base.py: Added validate='many_to_many'.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
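The S1515 fix above (default parameter `c=col` / `s=serial`) can be shown with a toy loop. All lambdas created in the second loop close over the same `col` variable, so they all see its final value; the default argument freezes the value per iteration instead.

```python
ops = {}
for col in ['a', 'b']:
    ops[col] = lambda row, c=col: row[c]  # c=col binds at definition time

late = {}
for col in ['a', 'b']:
    late[col] = lambda row: row[col]  # closes over col -> last value wins

row = {'a': 1, 'b': 2}
print(ops['a'](row), ops['b'](row))    # 1 2
print(late['a'](row), late['b'](row))  # 2 2  (both lambdas see col == 'b')
```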
…81, S1481

- Fix python:S112 (generic Exception class) in total_workers_vcpu.py, util.py,
  storage/crc.py, storage/segment.py, storage/directory.py, storage/s3.py:
  Replaced bare Exception raises with more specific RuntimeError, TypeError,
  or ValueError as appropriate.
- Fix python:S125 (commented-out code) in base_dataframe.py, job_host_summary.py,
  storage/util.py, test_factory.py, report/base.py, report_ccsp_v2.py, and
  dataframe_jobhost_summary_usage.py: Removed dead commented code blocks.
- Fix python:S1172 (unused function parameters) in extractors.py (local -> _local),
  utils.py (last_gather/save_last_gather now use **_kwargs), helpers.py
  (key,last_gather -> _key,_last_gather), base_anonymized_rollup.py
  (dataframe -> _dataframe), report/base.py (removed unused mode parameter).
- Fix python:S1871 (duplicate if-elif branches) in base_anonymized_rollup.py:
  Merged identical list and dict branches into a single isinstance(value, (list,dict)) check.
- Fix python:S5781 (duplicate set literal entry) in test_renewal_guidance.py:
  Removed duplicate None from set literal.
- Fix python:S1481 (unused loop variable) in renewal_guidance.py: Replaced i with _.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
…8521, S7519, S7494

- Fix python:S7508 (redundant list() before sorted()) in anonymized_rollups.py,
  credentials_anonymized_rollup.py, events_modules_anonymized_rollup.py,
  test_jobs_anonymized_rollups.py, test_multiple_files.py, report/base.py:
  Removed unnecessary list() wrapping since sorted() accepts any iterable.
- Fix python:S6353 (use \w instead of [A-Za-z0-9_]) in events_modules_anonymized_rollup.py:
  Simplified regex patterns to use \w shorthand.
- Fix python:S7504 (unnecessary list() on iterable) in renewal_guidance.py:
  Removed redundant list() in list comprehension iteration.
- Fix python:S7500 (replace comprehension with collection constructor) in
  conftest.py and report_saver_s3.py: Simplified unnecessary comprehension wrappers.
- Fix python:S8521 (unnecessary .keys() call) in test_gathering.py and test_slicing.py:
  Removed .keys() calls since dict membership testing is implicit.
- Fix python:S7519 (use dict.fromkeys) in management_utility.py: Replaced dict
  comprehension with dict.fromkeys() call.
- Fix python:S7494 (use set comprehension) in main_jobevent_service.py: Replaced
  the set() constructor wrapping a generator expression with a set comprehension.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
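The MINOR-level cleanups in this commit are all small idiom substitutions; a quick illustration (toy values, not the project's data):

```python
# S7508: sorted() accepts any iterable, so list() wrapping is redundant.
names = {'b', 'a', 'c'}
assert sorted(names) == sorted(list(names))

# S7519: dict.fromkeys builds a dict with a constant value for every key,
# equivalent to {c: None for c in [...]}.
commands = dict.fromkeys(['gather', 'build', 'ship'])

# S7494: a set comprehension replaces set(generator).
ids = {e['id'] for e in [{'id': 1}, {'id': 1}, {'id': 2}]}

print(sorted(names), commands, ids)
```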
@cshiels-ie cshiels-ie marked this pull request as draft April 27, 2026 12:06
cshiels-ie and others added 6 commits April 27, 2026 13:09
… pattern

In Python 3, \w matches Unicode word characters (accented letters, CJK, etc.)
by default. The original explicit character class [A-Za-z0-9_] correctly
restricts matching to ASCII Ansible collection name characters only.

Addresses Cursor Bugbot review comment on PR ansible#381.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
- Introduced a new constant DATETIME64_NS in metric_utils.py for consistent datetime representation.
- Removed commented-out code in Base class methods in base.py and base_traditional.py to enhance clarity and maintainability.
- Updated imports in dataframe_jobhost_summary_usage.py and job_host_summary.py to utilize the new constant.

This commit improves code readability and standardizes datetime handling across the project.
- Introduced new tests to validate that DictOutput raises a TypeError when provided with non-dict input.
- Added a test for CollectionOutput to ensure it raises a TypeError when given non-list input.
- Updated imports in test_collectors_util.py to include the new output classes.

This commit enhances the robustness of the utility by ensuring proper error handling for invalid input types.
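A hypothetical sketch of the validation the new tests pin down — the real DictOutput/CollectionOutput classes live in the library's collectors util module and differ in detail:

```python
class DictOutput:
    def __init__(self, data):
        # Reject anything that is not a dict, per the new TypeError tests.
        if not isinstance(data, dict):
            raise TypeError(f'DictOutput expects a dict, got {type(data).__name__}')
        self.data = data


class CollectionOutput:
    def __init__(self, items):
        if not isinstance(items, list):
            raise TypeError(f'CollectionOutput expects a list, got {type(items).__name__}')
        self.items = items


rejections = []
for bad in ('not-a-dict', 42):
    try:
        DictOutput(bad)
    except TypeError as exc:
        rejections.append(str(exc))

print(rejections)
print(CollectionOutput([1, 2]).items)  # [1, 2]
```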
- Extract shared sanity check logic into run_report_sanity_check() in
  conftest.py, eliminating the identical test_command body duplicated
  across test_complex_CCSP_sanity_check.py and
  test_complex_CCSPv2_sanity_check.py
- Replace hardcoded /tmp path with pytest tmp_path fixture in
  test_collection_output_raises_type_error_for_non_list to avoid
  world-writable directory security warning

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
The 4 static schema methods (unique_index_columns, data_columns,
cast_types, operations) were character-identical between
DataframeJobhostSummaryUsage and DataframeJobHostSummary, causing
SonarCloud to report 75% duplication on new code in
dataframe_jobhost_summary_usage.py.

Move the method return values to shared constants in metric_utils.py
(JOB_HOST_SUMMARY_INDEX_COLUMNS, JOB_HOST_SUMMARY_DATA_COLUMNS,
JOB_HOST_SUMMARY_CAST_TYPES, JOB_HOST_SUMMARY_OPERATIONS) and replace
the multi-line method bodies with single-line constant references in
both classes.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Duplication (5.4% → target ≤3%):
- Introduce DataframeSchemaMixin in dataframe_schema.py with the 4 stub
  methods (cast_types/data_columns/operations/unique_index_columns)
- Introduce JobHostSummarySchema(DataframeSchemaMixin) with the shared
  job-host-summary schema overrides
- Base and BaseTraditional inherit from DataframeSchemaMixin; removing
  the duplicated stub bodies from both files
- DataframeJobhostSummaryUsage and DataframeJobHostSummary inherit from
  JobHostSummarySchema; removing the duplicated one-liner bodies from
  both files

Coverage (78.1% → target ≥90%):
- test_dataframe_stubs.py: cover DataframeSchemaMixin and
  JobHostSummarySchema directly in dataframe_schema.py
- test_summarize_merge_ops.py: cover combine_set/combine_json/
  combine_json_values lambda branches and validate='many_to_many' path
  in both Base.merge() and BaseTraditional.merge()
- test_coverage_gaps.py: cover BaseAnonymizedRollup.base(),
  ManagementUtility.get_commands(), ReportSaverS3.report_exist(),
  StorageCRC._put() RuntimeError, EventModulesAnonymizedRollup
  host_ids branch, PackageCRC.is_shipping_configured() super() call

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
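The consolidation described above can be sketched roughly as follows. The class names match the commit; the constant values and column names here are illustrative stand-ins, not the real schema in metric_utils.py:

```python
# Illustrative stand-ins for the shared constants in metric_utils.py.
JOB_HOST_SUMMARY_INDEX_COLUMNS = ['org_name', 'host_name']
JOB_HOST_SUMMARY_OPERATIONS = {'host_runs': 'sum'}


class DataframeSchemaMixin:
    """Default (empty) schema stubs shared by all dataframe bases."""

    @staticmethod
    def unique_index_columns():
        return []

    @staticmethod
    def operations():
        return {}


class JobHostSummarySchema(DataframeSchemaMixin):
    """Single home for the job-host-summary schema overrides."""

    @staticmethod
    def unique_index_columns():
        return JOB_HOST_SUMMARY_INDEX_COLUMNS

    @staticmethod
    def operations():
        return JOB_HOST_SUMMARY_OPERATIONS


# Both dataframe implementations now inherit the same one-line schema
# instead of carrying character-identical method bodies.
class DataframeJobHostSummary(JobHostSummarySchema):
    pass


print(DataframeJobHostSummary.unique_index_columns())
```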
@cshiels-ie cshiels-ie marked this pull request as ready for review April 27, 2026 14:07
@cshiels-ie
Contributor Author

/cursor review

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (5)
metrics_utility/test/library/test_summarize_merge_ops.py (2)

57-91: Merge tests pass but don't exercise many-to-many semantics.

Both tests use disjoint single-row indexes, which is effectively a one-to-one outer join — they confirm merge() doesn't crash and returns 2 rows but don't actually validate the many-to-many path. Fine to keep as a smoke test for the SonarCloud fix; just noting that duplicate keys on both sides would more directly cover the validate parameter behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics_utility/test/library/test_summarize_merge_ops.py` around lines 57 -
91, Update the two tests (test_base_traditional_merge_many_to_many and
test_base_engine_merge_many_to_many) so they use duplicate keys on both sides to
actually exercise many-to-many semantics: build df1 and df2 with repeated 'id'
values (e.g., id duplicated) so the merge triggers the many-to-many path in
Base.merge()/ _ConcreteTraditional.merge and assert the expected expanded row
count (e.g., 4 for two duplicates on each side) instead of using disjoint
single-row indexes.
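A minimal sketch of what the validate argument actually enforces (toy frames; the real merges key on unique_index_columns()): with validate='one_to_one', pandas raises pandas.errors.MergeError as soon as a merge key repeats, instead of silently fanning rows out.

```python
import pandas as pd

left = pd.DataFrame({'id': [1, 1], 'x': [10, 11]})   # duplicate key on the left
right = pd.DataFrame({'id': [1, 2], 'y': [20, 21]})

# many_to_many permits the duplicate: id=1 fans out to 2 rows, plus id=2.
fanned = pd.merge(left, right, on='id', how='outer', validate='many_to_many')
print(len(fanned))  # 3

# one_to_one rejects it up front instead of producing extra rows.
rejected = None
try:
    pd.merge(left, right, on='id', how='outer', validate='one_to_one')
except pd.errors.MergeError as exc:
    rejected = exc
print('rejected:', rejected)
```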

30-33: Weak assertion for combine_json_values.

The test only verifies key 'k' is present, not that values from both inputs are actually combined. Given combine_json_values builds a set per key, a stronger assertion exercises the real behavior:

♻️ Proposed stronger assertion
-def test_summarize_combine_json_values():
-    df = pd.DataFrame({'col_x': [{'k': 'v1'}], 'col_y': [{'k': 'v2'}]})
-    result = _bt().summarize_merged_dataframes(df, ['col'], operations={'col': 'combine_json_values'})
-    assert 'k' in result['col'].iloc[0]
+def test_summarize_combine_json_values():
+    df = pd.DataFrame({'col_x': [{'k': 'v1'}], 'col_y': [{'k': 'v2'}]})
+    result = _bt().summarize_merged_dataframes(df, ['col'], operations={'col': 'combine_json_values'})
+    assert result['col'].iloc[0] == {'k': {'v1', 'v2'}}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics_utility/test/library/test_summarize_merge_ops.py` around lines 30 -
33, The test test_summarize_combine_json_values has a weak assertion that only
checks the presence of key 'k' but not that values from both input dicts were
combined; update the assertion on the result of
_bt().summarize_merged_dataframes (operation 'combine_json_values') to assert
that the value for 'k' contains both 'v1' and 'v2' (e.g., that the set or list
at result['col'].iloc[0]['k'] equals or includes both values) so the test
verifies the actual combining behavior.
metrics_utility/library/dataframes/base_traditional.py (1)

14-14: Mixin defaults silently return empty schemas.

DataframeSchemaMixin returns {}/[] for all schema methods, so any subclass of BaseTraditional that forgets to override (e.g.) unique_index_columns() will silently merge/cast on empty keys and produce wrong rollups instead of failing fast. The previous local placeholders presumably had the same property, so this isn't a regression — but worth either documenting the contract on the mixin or having the defaults raise NotImplementedError (or using abc.abstractmethod) to surface mistakes early.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics_utility/library/dataframes/base_traditional.py` at line 14,
DataframeSchemaMixin currently returns empty dicts/lists which lets subclasses
like BaseTraditional silently use empty schemas; change the mixin so its schema
accessor methods (e.g., unique_index_columns, index_columns, schema_fields —
whatever schema methods exist on DataframeSchemaMixin) either raise
NotImplementedError by default or, better, mark them as abstract using
abc.abstractmethod so concrete subclasses must implement them; update
DataframeSchemaMixin accordingly and ensure BaseTraditional (and its subclasses)
implement the required methods to avoid silent empty-schema behavior.
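One way to surface the missing-override mistake early, as the comment suggests — a strict variant of the mixin (hypothetical; the shipped mixin returns empty defaults) using abc.abstractmethod so instantiation fails fast:

```python
import abc


class DataframeSchemaMixin(abc.ABC):
    """Strict variant: subclasses must supply the schema."""

    @staticmethod
    @abc.abstractmethod
    def unique_index_columns():
        ...


class Incomplete(DataframeSchemaMixin):
    pass  # forgot the override


class Complete(DataframeSchemaMixin):
    @staticmethod
    def unique_index_columns():
        return ['id']


failed = None
try:
    Incomplete()  # TypeError: Can't instantiate abstract class
except TypeError as exc:
    failed = exc

print('caught:', failed)
print(Complete().unique_index_columns())  # ['id']
```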
metrics_utility/automation_controller_billing/report_saver/report_saver_s3.py (1)

19-19: Avoid materializing entire iterable just to check non-emptiness.

If list_files returns an iterator/generator over potentially many S3 objects, list(...) walks all of them. Using any() short-circuits on the first item.

♻️ Proposed refactor
-        return len(list(self.s3_handler.list_files(self.report_spreadsheet_destination_path))) > 0
+        return any(True for _ in self.s3_handler.list_files(self.report_spreadsheet_destination_path))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@metrics_utility/automation_controller_billing/report_saver/report_saver_s3.py`
at line 19, The current check materializes the entire iterator by doing
len(list(self.s3_handler.list_files(self.report_spreadsheet_destination_path)))
> 0; replace this with a short-circuiting check using
any(self.s3_handler.list_files(self.report_spreadsheet_destination_path)) so the
generator/iterator returned by list_files is not fully consumed (update the
return expression in the method that contains this line).
metrics_utility/test/library/test_storage_s3.py (1)

92-94: LGTM — covers the new ValueError contract.

Asserting both the exception type and match='bucket not set' pins down the public failure mode of StorageS3() without a bucket.

One small consideration: the message in s3.py is 'StorageS3: bucket not set' while other tests in the suite (e.g. test_storage_directory.py) likely match the prefixed form. Worth verifying the match regex is the intended substring across the storage test family for consistency.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics_utility/test/library/test_storage_s3.py` around lines 92 - 94, The
test test_missing_bucket_raises_value_error asserts the error message is 'bucket
not set' but the implementation in s3.py raises "StorageS3: bucket not set",
causing an inconsistency with other storage tests; update the
test_missing_bucket_raises_value_error to match the actual prefixed message
(e.g., use match='StorageS3: bucket not set' or a regex that accepts the prefix)
or alternatively change the raise message in StorageS3 (in s3.py) to the
unprefixed form—ensure consistency across StorageS3, the test function
test_missing_bucket_raises_value_error, and other storage tests like
test_storage_directory.py.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@metrics_utility/library/dataframes/base_traditional.py`:
- Line 79: The merge call using pd.merge(rollup, new_group,
on=self.unique_index_columns(), how='outer', validate='many_to_many') should
enforce uniqueness of the merge keys; change validate from 'many_to_many' to
'one_to_one' so that duplicates in rollup or new_group (which should have been
prevented by dedup()/regroup()) raise an error and prevent silent data
corruption; locate the pd.merge invocation that references rollup, new_group and
self.unique_index_columns() and update its validate argument accordingly,
keeping the outer merge semantics if that behavior is required.

In `@metrics_utility/library/storage/segment.py`:
- Around line 108-110: The conditional in StorageSegment that reads "if filename
or fileobj or dict is None:" conflates missing filename/fileobj with a missing
dict and produces a misleading message; change the check to explicitly test for
(filename is None and fileobj is None) and separately for (dict is None) so you
can raise distinct ValueError messages—one saying "StorageSegment: filename and
fileobj both missing; provide filename or fileobj" and the other saying
"StorageSegment: dict missing; use dict="—so locate the check in the
StorageSegment constructor or the function that validates inputs (references:
filename, fileobj, dict) and split the conditions accordingly to raise the
appropriate message for each missing parameter case.

In `@metrics_utility/test/library/test_storage_segment.py`:
- Around line 154-157: The test name and behavior are mismatched: update the
test in test_storage_segment.py so it actually verifies the "no dict" branch of
StorageSegment.put instead of the filename branch—remove the filename='foo.json'
argument from the call to storage.put(artifact_name='test', ...) so the call
exercises the missing dict path and keeps the pytest.raises(ValueError,
match='use dict=') assertion; alternatively, if you want to lock both branches,
add a separate test named test_put_with_filename_raises_value_error that passes
filename='foo.json' and asserts the same ValueError for the unsupported filename
branch.
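The precedence pitfall behind the StorageSegment finding can be shown directly (variable names are stand-ins for the real parameters): `is` binds tighter than `or`, so the original condition tests the truthiness of filename and fileobj rather than whether they are None, and all three cases share one misleading message.

```python
filename, fileobj, data = 'foo.json', None, {'k': 1}

# "filename or fileobj or data is None" parses as:
#   filename or fileobj or (data is None)
# i.e. three unrelated truthiness tests chained with `or`.
assert (filename or fileobj or data is None) == 'foo.json'

# Explicit grouping makes each failure mode separately testable,
# so each can raise its own ValueError message:
missing_inputs = filename is None and fileobj is None
missing_data = data is None
print(missing_inputs, missing_data)  # False False
```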


ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Organization UI (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 2cb04212-5a18-4677-8cb1-aed3c90f9fd9

📥 Commits

Reviewing files that changed from the base of the PR and between 828c8c1 and bcc179e.

📒 Files selected for processing (52)
  • metrics_utility/anonymized_rollups/anonymized_rollups.py
  • metrics_utility/anonymized_rollups/base_anonymized_rollup.py
  • metrics_utility/anonymized_rollups/credentials_anonymized_rollup.py
  • metrics_utility/anonymized_rollups/events_modules_anonymized_rollup.py
  • metrics_utility/automation_controller_billing/dataframe_engine/base.py
  • metrics_utility/automation_controller_billing/dataframe_engine/dataframe_jobhost_summary_usage.py
  • metrics_utility/automation_controller_billing/dedup/renewal_guidance.py
  • metrics_utility/automation_controller_billing/package/package_crc.py
  • metrics_utility/automation_controller_billing/report/base.py
  • metrics_utility/automation_controller_billing/report/report_ccsp_v2.py
  • metrics_utility/automation_controller_billing/report_saver/report_saver_s3.py
  • metrics_utility/base/package.py
  • metrics_utility/dataframe_schema.py
  • metrics_utility/library/collectors/controller/main_jobevent_service.py
  • metrics_utility/library/collectors/others/total_workers_vcpu.py
  • metrics_utility/library/collectors/util.py
  • metrics_utility/library/dataframes/base_dataframe.py
  • metrics_utility/library/dataframes/base_traditional.py
  • metrics_utility/library/dataframes/job_host_summary.py
  • metrics_utility/library/extractors.py
  • metrics_utility/library/storage/crc.py
  • metrics_utility/library/storage/directory.py
  • metrics_utility/library/storage/s3.py
  • metrics_utility/library/storage/segment.py
  • metrics_utility/library/storage/util.py
  • metrics_utility/library/utils.py
  • metrics_utility/management_utility.py
  • metrics_utility/metric_utils.py
  • metrics_utility/test/base/functional/helpers.py
  • metrics_utility/test/base/functional/test_gathering.py
  • metrics_utility/test/base/functional/test_slicing.py
  • metrics_utility/test/base/test_package.py
  • metrics_utility/test/ccspv_reports/dedup/test_complex_CCSPv2_with_canonical_facts/test_complex_CCSPv2_with_canonical_facts.py
  • metrics_utility/test/ccspv_reports/test_complex_CCSP_sanity_check.py
  • metrics_utility/test/ccspv_reports/test_complex_CCSPv2_sanity_check.py
  • metrics_utility/test/ccspv_reports/test_empty_data_for_CCSP_and_CCSPv2.py
  • metrics_utility/test/ccspv_reports/test_invalid_data_for_CCSP_and_CCSPv2.py
  • metrics_utility/test/conftest.py
  • metrics_utility/test/dedup/test_factory.py
  • metrics_utility/test/dedup/test_renewal_guidance.py
  • metrics_utility/test/library/test_collectors_prometheus_client.py
  • metrics_utility/test/library/test_collectors_util.py
  • metrics_utility/test/library/test_coverage_gaps.py
  • metrics_utility/test/library/test_dataframe_stubs.py
  • metrics_utility/test/library/test_storage_crc.py
  • metrics_utility/test/library/test_storage_directory.py
  • metrics_utility/test/library/test_storage_s3.py
  • metrics_utility/test/library/test_storage_segment.py
  • metrics_utility/test/library/test_summarize_merge_ops.py
  • metrics_utility/test/test_anonymized_rollups/test_jobs_anonymized_rollups.py
  • metrics_utility/test/test_anonymized_rollups/test_multiple_files.py
  • metrics_utility/test/test_helpers.py
💤 Files with no reviewable changes (3)
  • metrics_utility/library/storage/util.py
  • metrics_utility/test/ccspv_reports/test_invalid_data_for_CCSP_and_CCSPv2.py
  • metrics_utility/test/dedup/test_factory.py

…o_one' and enhance error handling in StorageSegment

- Changed the merge validation parameter from 'many_to_many' to 'one_to_one' in the merge methods of Base and BaseTraditional classes to ensure correct data merging behavior.
- Improved error handling in StorageSegment by refining the conditions for raising ValueErrors when invalid arguments are provided, ensuring that the 'dict' argument is required.
- Updated related tests to reflect the changes in error messages and validation logic.
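The stricter merge contract described in the first bullet can be sketched in isolation. This is a minimal illustration of pandas' `validate='one_to_one'` behavior; the frames and column names are hypothetical, not the project's actual rollup data:

```python
import pandas as pd

# Illustrative frames; 'host_id'/'metric_*' are made-up column names
left = pd.DataFrame({'host_id': [1, 2], 'metric_x': [10, 20]})
right = pd.DataFrame({'host_id': [1, 2], 'metric_y': [30, 40]})
dup_right = pd.DataFrame({'host_id': [1, 1], 'metric_y': [30, 40]})

# With unique keys on both sides the validated merge succeeds
merged = pd.merge(left, right, on='host_id', validate='one_to_one')

# With a duplicated key, validate='one_to_one' raises MergeError
# instead of silently fanning out rows as 'many_to_many' would
try:
    pd.merge(left, dup_right, on='host_id', validate='one_to_one')
    raised = False
except pd.errors.MergeError:
    raised = True
```

The point of the change is exactly this failure mode: a duplicated key that previously produced extra rows silently now fails loudly at merge time.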

@coderabbitai Bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (6)
metrics_utility/test/base/functional/helpers.py (1)

64-67: Inconsistent with the S7508 cleanup applied elsewhere.

The same `in files.keys()` idiom was simplified throughout test_gathering.py and test_slicing.py in this PR but left here. For consistency:

♻️ Proposed cleanup
 def assert_common_files(files):
-    assert './config.json' in files.keys()
-    assert './manifest.json' in files.keys()
-    assert './data_collection_status.csv' in files.keys()
+    assert './config.json' in files
+    assert './manifest.json' in files
+    assert './data_collection_status.csv' in files
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics_utility/test/base/functional/helpers.py` around lines 64 - 67, In
assert_common_files, remove the unnecessary .keys() calls to match the S7508
cleanup elsewhere: update the three assertions in the assert_common_files
function to check membership directly in the mapping (e.g., assert
'./config.json' in files) instead of using in files.keys(); this keeps
consistency with test_gathering.py and test_slicing.py.
metrics_utility/library/utils.py (1)

30-37: No positional callers found; however, consider preserving explicit parameter names for the public API.

The search confirms all repository callers use keyword arguments (db=, key=, value=), so the **_kwargs signature is safe internally. Both functions are exported in library/__init__.py as public API. While the current change introduces no breakage within the repo, external consumers relying on the explicit signature could be affected. For a public API, preserving the parameter contract improves discoverability and reduces surprise:

♻️ Suggested alternative preserving the signature
-def last_gather(**_kwargs):
+def last_gather(_db=None, _key=None):
     log('library.utils last_gather')
     return None


-def save_last_gather(**_kwargs):
+def save_last_gather(_db=None, _key=None, _value=None):
     log('library.utils save_last_gather')
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics_utility/library/utils.py` around lines 30 - 37, The public functions
last_gather and save_last_gather were changed to accept only **_kwargs which can
surprise external callers; restore explicit parameter names in their signatures
(e.g., def last_gather(db=None, key=None, value=None, **_kwargs) and def
save_last_gather(db=None, key=None, value=None, **_kwargs)) so the public API
documents expected parameters while still allowing forward-compatible keyword
captures; update the function docstrings/comments if present and keep the
internal logging behavior intact (these functions are exported via
library.__init__ as public API).
metrics_utility/automation_controller_billing/report_saver/report_saver_s3.py (1)

18-19: Prefer any(...) over len(list(...)) > 0 to avoid materializing all keys.

s3_handler.list_files is a generator that paginates over S3 (paginator.paginate(...)); wrapping it in list(...) forces enumeration of every object under the prefix just to check existence, which is wasted I/O and memory when many files exist. Use any(...) to short-circuit on the first hit.

Proposed fix
-    def report_exist(self):
-        return len(list(self.s3_handler.list_files(self.report_spreadsheet_destination_path))) > 0
+    def report_exist(self):
+        return any(self.s3_handler.list_files(self.report_spreadsheet_destination_path))
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In
`@metrics_utility/automation_controller_billing/report_saver/report_saver_s3.py`
around lines 18 - 19, The report_exist method currently materializes all keys by
doing
len(list(self.s3_handler.list_files(self.report_spreadsheet_destination_path)))
> 0; change it to use a short-circuiting check with any(...) over the generator
returned by self.s3_handler.list_files(self.report_spreadsheet_destination_path)
so it returns True on the first found object and avoids building a full list.
metrics_utility/test/base/test_package.py (1)

11-23: Optional: prefer a real Package instance for has_free_space tests.

Calling Package.has_free_space(pkg, …) with a MagicMock(spec=Package) works today, but it bypasses the actual get_max_data_size classmethod and only exercises the inequality. Constructing a real Package(collector=Mock()) (or a thin subclass overriding MAX_DATA_SIZE) would also catch regressions in get_max_data_size itself.

Suggested approach
-def test_has_free_space_within_limit():
-    pkg = MagicMock(spec=Package)
-    pkg.total_data_size = 100
-    pkg.get_max_data_size.return_value = 200
-    assert Package.has_free_space(pkg, 99) is True
-    assert Package.has_free_space(pkg, 100) is True  # exactly at limit
+def test_has_free_space_within_limit(monkeypatch):
+    pkg = Package(collector=MagicMock())
+    pkg.total_data_size = 100
+    monkeypatch.setattr(Package, 'MAX_DATA_SIZE', 200)
+    assert pkg.has_free_space(99) is True
+    assert pkg.has_free_space(100) is True  # exactly at limit
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics_utility/test/base/test_package.py` around lines 11 - 23, Tests use
MagicMock(spec=Package) which bypasses Package.get_max_data_size behavior;
replace the MagicMock-based pkg with a real Package instance (e.g.,
Package(collector=Mock()) or a small test subclass that sets MAX_DATA_SIZE) so
that Package.has_free_space is exercised against the actual get_max_data_size
logic in tests test_has_free_space_within_limit and
test_has_free_space_exceeds_limit; ensure you set pkg.total_data_size
appropriately and call Package.has_free_space(pkg, ...) as before.
metrics_utility/anonymized_rollups/anonymized_rollups.py (1)

278-278: Optional: drop the redundant ternary.

sorted(empty_set) already returns [], so the if ansible_versions_set else [] guard is unnecessary.

♻️ Proposed simplification
-    return sorted(ansible_versions_set) if ansible_versions_set else []
+    return sorted(ansible_versions_set)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics_utility/anonymized_rollups/anonymized_rollups.py` at line 278, The
return expression currently uses a redundant ternary: "return
sorted(ansible_versions_set) if ansible_versions_set else []"; simplify it by
returning sorted(ansible_versions_set) directly since sorted(empty_set) already
yields [], i.e., replace that return with "return sorted(ansible_versions_set)"
referencing the ansible_versions_set variable and the existing sorted call.
metrics_utility/library/collectors/util.py (1)

22-23: TypeError is the right exception, but consider isinstance for subclass tolerance.

type(x) is not dict / type(x) is not list rejects valid subclasses (e.g., OrderedDict, collections.UserList). If you want to keep the exact-type semantics, fine — otherwise prefer isinstance for forward-compatibility.

♻️ Optional refactor
-        if type(data) is not dict:
+        if not isinstance(data, dict):
             raise TypeError('data must be a dict, or None')
-        if type(filenames) is not list:
+        if not isinstance(filenames, list):
             raise TypeError('filenames must be a list, or None')

Also applies to: 40-41

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@metrics_utility/library/collectors/util.py` around lines 22 - 23, Replace
strict type checks that use "type(x) is not dict" and "type(x) is not list" with
isinstance-based checks so subclasses are accepted; specifically update the
conditional guarding the variable "data" (currently using type(data) is not
dict) to use "isinstance(data, dict)" and likewise update the check around the
list variable (the check at lines ~40-41) to use isinstance(..., list) or, if
more appropriate, collections.abc.Sequence/UserList depending on intended
semantics, preserving the existing TypeError and error message text.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@metrics_utility/metric_utils.py`:
- Around line 11-53: JobHostSummarySchema methods unique_index_columns(),
data_columns(), cast_types(), and operations() currently return direct
references to module-level mutable lists/dicts (JOB_HOST_SUMMARY_INDEX_COLUMNS,
JOB_HOST_SUMMARY_DATA_COLUMNS, JOB_HOST_SUMMARY_CAST_TYPES,
JOB_HOST_SUMMARY_OPERATIONS); change those methods to return immutable views
(e.g., convert lists to tuples and dicts to types.MappingProxyType) or return
shallow copies (list(...) and dict(...)) so callers cannot mutate the shared
module-level objects and corrupt global schema state.
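The immutable-view suggestion above can be sketched as follows. This is a generic illustration of the pattern, assuming schema constants like the `JOB_HOST_SUMMARY_*` names from the comment; the values are made up and this is not the project's actual `JobHostSummarySchema`:

```python
from types import MappingProxyType

# Hypothetical module-level schema constants standing in for the
# JOB_HOST_SUMMARY_* names; values are illustrative only
JOB_HOST_SUMMARY_INDEX_COLUMNS = ['job_id', 'host_id']
JOB_HOST_SUMMARY_CAST_TYPES = {'host_id': 'int64', 'failed': 'bool'}


class JobHostSummarySchema:
    @staticmethod
    def unique_index_columns():
        # A tuple is an immutable view; callers cannot append/remove
        return tuple(JOB_HOST_SUMMARY_INDEX_COLUMNS)

    @staticmethod
    def cast_types():
        # MappingProxyType is a read-only wrapper over the shared dict;
        # any mutation attempt raises TypeError
        return MappingProxyType(JOB_HOST_SUMMARY_CAST_TYPES)


cols = JobHostSummarySchema.unique_index_columns()
types_view = JobHostSummarySchema.cast_types()

try:
    types_view['host_id'] = 'float64'  # mutation attempt on the view
    mutated = True
except TypeError:
    mutated = False
```

`MappingProxyType` has the advantage of reflecting later updates to the underlying dict without copying, whereas `dict(...)`/`list(...)` copies isolate the caller completely; either satisfies the finding.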

In `@metrics_utility/test/library/test_summarize_merge_ops.py`:
- Around line 31-34: The test test_summarize_combine_json_values uses a weak
assertion that only checks for key presence; update it to assert the exact
expected merged structure from summarize_merged_dataframes when using the
combine_json_values operation (e.g., verify result['col'].iloc[0] equals the
expected combined JSON object/collection), so the test asserts both keys and
values produced by combine_json_values and will fail if values are dropped or
combined incorrectly.
- Around line 19-34: Add a test that exercises multiple columns in one call to
ensure the lambda closure bug in summarize_merged_dataframes is prevented:
create a DataFrame with multiple pairs like 'a_x'/'a_y', 'b_x'/'b_y',
'c_x'/'c_y', call _bt().summarize_merged_dataframes(df, ['a','b','c'],
operations=ops) where ops =
{'a':'combine_json','b':'combine_set','c':'combine_json_values'}, and assert
each resulting column (accessed via result['a'], result['b'], result['c']) has
the expected merged values; name the test
test_summarize_lambda_branches_bind_per_column so it fails if the loop variable
`col` in summarize_merged_dataframes (and any lambda like lambda row: ...) is
not properly bound per-iteration.
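For reference, the S1515 capture pattern the suggested test is meant to guard against looks like this. This is a generic sketch, not the project's actual summarize_merged_dataframes code:

```python
def make_extractors_buggy(columns):
    extractors = {}
    for col in columns:
        # BUG (S1515): every lambda closes over the loop variable 'col',
        # so after the loop all of them read whatever 'col' last was
        extractors[col] = lambda row: row[col]
    return extractors


def make_extractors_fixed(columns):
    extractors = {}
    for col in columns:
        # FIX: bind the current value via a default argument (c=col),
        # which is evaluated once per iteration
        extractors[col] = lambda row, c=col: row[c]
    return extractors


row = {'a': 1, 'b': 2, 'c': 3}
buggy = make_extractors_buggy(['a', 'b', 'c'])
fixed = make_extractors_fixed(['a', 'b', 'c'])
```

With a single column the buggy and fixed versions behave identically, which is why the suggested test must exercise multiple columns in one call to catch a regression.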

ℹ️ Review info
Run ID: 9212e7f3-e340-4b40-aecb-a59972eb54f1

📥 Commits

Reviewing files that changed from the base of the PR and between 828c8c1 and 7c09af8.


- Return list/dict copies from JobHostSummarySchema methods to prevent
  callers from accidentally mutating the shared module-level constants
- Strengthen test_summarize_combine_json_values to assert the full
  {'k': {'v1', 'v2'}} structure rather than just key presence
- Add test_summarize_lambda_branches_bind_per_column: exercises all
  three lambda operations (combine_json, combine_set, combine_json_values)
  across multiple columns in a single call — the only scenario that would
  catch a regression of the c=col closure-capture fix (S1515)

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
@sonarqubecloud

@cshiels-ie cshiels-ie changed the title Fix SonarCloud issues [WIP] [After 2.7] Fix SonarCloud issues Apr 28, 2026
@SherinV SherinV requested a review from pmflanagan April 28, 2026 19:32
@cshiels-ie cshiels-ie changed the title [WIP] [After 2.7] Fix SonarCloud issues [After 2.7] Fix SonarCloud issues Apr 29, 2026